Indexing Incomplete Databases

Authors

  • Guadalupe Canahuate
  • Michael Gibas
  • Hakan Ferhatosmanoglu
Abstract

Incomplete databases, that is, databases that are missing data, are present in many research domains, so it is important to derive techniques to access them efficiently. We first show that known indexing techniques for multi-dimensional data search break down in terms of performance when indexed attributes contain missing data. This paper uses two widely employed indexing techniques, bitmaps and quantization, to answer queries correctly and efficiently in the presence of missing data. Query execution and interval evaluation are formalized for the indexing structures based on whether missing data is considered to be a query match or not. The performance of bitmap indexes and quantization-based indexes is evaluated and compared over a variety of analysis parameters for real and synthetic data sets. Insights into the conditions under which to use each technique are provided.
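
The two query semantics mentioned in the abstract (missing data counted as a match, or not) can be illustrated with a small sketch. This is not the paper's encoding or algorithm: it assumes a plain equality-encoded bitmap per attribute value plus one extra bitmap marking rows whose value is missing, and the names BitmapIndex, range_query, rows, and the sample columns are all hypothetical.

```python
from typing import List, Optional, Sequence


class BitmapIndex:
    """Equality-encoded bitmap index over one attribute (illustrative only).

    Assumption: values are integers in [0, cardinality) and None marks a
    missing value. Each bitmap is a Python int used as a bitset, where
    bit i corresponds to row i.
    """

    def __init__(self, values: Sequence[Optional[int]], cardinality: int):
        self.bitmaps = [0] * cardinality  # bitmaps[v]: rows whose value == v
        self.missing = 0                  # rows whose value is missing
        for row, v in enumerate(values):
            if v is None:
                self.missing |= 1 << row
            else:
                self.bitmaps[v] |= 1 << row

    def range_query(self, lo: int, hi: int, missing_is_match: bool) -> int:
        """Return a bitset of rows with lo <= value <= hi."""
        result = 0
        for v in range(lo, hi + 1):
            result |= self.bitmaps[v]
        # Under "missing is a match" semantics, rows with no value qualify too.
        if missing_is_match:
            result |= self.missing
        return result


def rows(bits: int) -> List[int]:
    """Convert a bitset into a sorted list of row numbers."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


# Hypothetical two-attribute table with gaps.
age = [1, None, 3, 2, None]
income = [0, 2, None, 1, 1]
age_idx = BitmapIndex(age, cardinality=4)
income_idx = BitmapIndex(income, cardinality=3)

# Query: age in [1, 2] AND income in [1, 2], under both semantics.
strict = age_idx.range_query(1, 2, False) & income_idx.range_query(1, 2, False)
loose = age_idx.range_query(1, 2, True) & income_idx.range_query(1, 2, True)
print(rows(strict))  # rows matching when missing values never match
print(rows(loose))   # rows matching when missing values always match
```

In this sketch the only difference between the two semantics is whether the missing-value bitmap is OR-ed into each per-attribute result before the results are AND-ed together, which is why the looser query can only return a superset of the stricter one.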

Related resources

Fast High-Dimensional Data Search in Incomplete Databases

We propose and evaluate two indexing schemes for improving the efficiency of data retrieval in high-dimensional databases that are incomplete. These schemes are novel in that the search keys may contain missing attribute values. The first is a multi-dimensional index structure, called the Bitstring-augmented R-tree (BR-tree), whereas the second comprises a family of multiple one-dimensional one...

Incomplete evidence: the inadequacy of databases in tracing published adverse drug reactions in clinical trials

Background: We would expect information on adverse drug reactions in randomised clinical trials to be easily retrievable from specific searches of electronic databases. However, complete retrieval of such information may not be straightforward, for two reasons. First, not all clinical drug trials provide data on the frequency of adverse effects. Secondly, not all electronic records of trials inc...

Survey on Various Methods and Techniques for Searching Dimension in Incomplete Database

Nowadays, the dimension-incomplete problem is a fundamental research problem in multidimensional databases. Information regarding the missing dimensions poses great computational challenges. In multidimensional databases, the similarity query problem occurs in numerous applications in the database area, such as data mining, information retrieval, etc. Due to various practical issues like remote data accessing ...

A comparison of the thesaurus structure of the Pubmed and Embase databases with the thesaurus construction standard of the US National Information Standards Organization, and a study of the indexing methods of the two databases

Introduction: According to mortality rates in Iran, cardiovascular diseases, neoplasms, perinatal mortality, and respiratory tract diseases were the leading causes of death in 2003 (1382). To reduce the mortality rate, the Iranian medical community needs to know more about recent therapeutic regimens. The two main medical databases are Pubmed and Embase. Researching Pubmed and Embase indexing methods and comparing Me...

The state of information retrieval in the Namayeh and Nama databases, and an assessment of the effectiveness of using controlled vocabulary in indexing these two databases

Purpose: This study was carried out to determine the level of precision, recall, and searching time for the “Nama” and “Namayeh” databases, and to find out which of the indexing tools (thesaurus and Dewey decimal classification) contributes more to improving information retrieval. Methodology: This study is an analytical survey in which the necessary data was collected by direct observati...

Publication date: 2006